When answering natural language questions over knowledge bases (KBs), incompleteness in the KB can naturally lead to many questions being unanswerable. While answerability has been explored in other QA settings, it has not been studied for QA over knowledge bases (KBQA). We first identify various forms of KB incompleteness that can result in a question being unanswerable. We then propose GrailQAbility, a new benchmark dataset, which systematically modifies GrailQA (a popular KBQA dataset) to represent all these incompleteness issues. Testing two state-of-the-art KBQA models (trained on original GrailQA as well as our GrailQAbility), we find that both models struggle to detect unanswerable questions, or sometimes detect them for the wrong reasons. Consequently, both models suffer significant loss in performance, underscoring the need for further research in making KBQA systems robust to unanswerability.
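Recently, external eye photos were shown to reveal signs of diabetic retinal disease and elevated HbA1c. In this paper, we evaluate whether external eye photos contain information about additional systemic medical conditions. We developed a deep learning system (DLS) that takes external eye photos as input and predicts multiple systemic parameters, such as those related to the liver (albumin, AST); kidney (eGFR estimated using the race-free 2021 CKD-EPI creatinine equation, urine ACR); bone and mineral (calcium); thyroid (TSH); and blood count (Hgb, WBC, platelets). Development leveraged 151,237 images from 49,015 patients with diabetes undergoing diabetic eye screening at 11 sites across Los Angeles County, California. Evaluation focused on 9 pre-specified systemic parameters and leveraged 3 validation sets (A, B, C) spanning 28,869 patients with and without diabetes undergoing eye screening at 3 independent sites in Los Angeles County, California, and the greater Atlanta area. We compared against baseline models incorporating available clinicodemographic variables (e.g. age, sex, race/ethnicity, years with diabetes). Relative to the baseline, the DLS achieved statistically significant superior performance at detecting AST > 36, calcium < 8.6, eGFR < 60, Hgb < 11, platelets < 150, ACR >= 300, and WBC < 4 on validation set A (a patient population similar to the development set), where the AUC of the DLS exceeded that of the baseline by 5.2-19.4%. On validation sets B and C, whose patient populations differ substantially from the development set, the DLS outperformed the baseline for ACR >= 300 and Hgb < 11 by 7.3-13.2%. Our findings provide further evidence that external eye photos contain biomarkers of systemic health spanning multiple organ systems. Further work is needed to investigate whether and how these biomarkers can be translated into clinical impact.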
We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.
Object movement identification is one of the most researched problems in the field of computer vision. In this task, we try to classify each pixel as foreground or background. Even though numerous traditional machine learning and deep learning methods already exist for this problem, the two major issues with most of them are the need for large amounts of ground truth data and their inferior performance on unseen videos. Since every pixel of every frame has to be labeled, acquiring large amounts of data for these techniques gets rather expensive. Recently, Zhao et al. [1] proposed a one-of-a-kind Arithmetic Distribution Neural Network (ADNN) for universal background subtraction, which utilizes probability information from the histogram of temporal pixels and achieves promising results. Building on this work, we developed an intelligent video surveillance system that uses the ADNN architecture for motion detection, trims the video to only the parts containing motion, and performs anomaly detection on the trimmed video.
Machine translation automatically translates text between natural languages, and Neural Machine Translation (NMT) has gained attention for its context-aware analysis and fluent, accurate translations. However, processing low-resource languages that lack relevant training resources such as supervised data remains a challenge for Natural Language Processing (NLP). We combined a technique known as active learning with the NMT toolkit Joey NMT to reach sufficient accuracy and robust predictions for low-resource language translation. With active learning, a semi-supervised machine learning strategy, the training algorithm uses selected query techniques to determine which unlabeled data would be most beneficial to label. We implemented two model-driven acquisition functions for selecting the samples to be validated. This work uses four transformer-based NMT systems for English-to-Hindi translation: a baseline model (BM), a fully trained model (FTM), an active learning least-confidence-based model (ALLCM), and an active learning margin-sampling-based model (ALMSM). The Bilingual Evaluation Understudy (BLEU) metric is used to evaluate system results. The BLEU scores of the BM, FTM, ALLCM, and ALMSM systems are 16.26, 22.56, 24.54, and 24.20, respectively. The findings in this paper demonstrate that active learning techniques help the model converge early and improve the overall quality of the translation system.
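The two acquisition strategies named in this abstract, least confidence and margin sampling, can be illustrated on toy probability distributions. This is a minimal sketch and not the paper's implementation: the function names, the `pool` data, and the use of per-item class probabilities (rather than sentence-level NMT scores) are all assumptions made for illustration.

```python
def least_confidence(probs):
    """Uncertainty as 1 minus the probability of the most likely outcome."""
    return 1.0 - max(probs)


def margin(probs):
    """Gap between the two most likely outcomes; a small gap means high uncertainty."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def select_for_labeling(pool, k, strategy="least_confidence"):
    """Return the indices of the k most uncertain items in the unlabeled pool."""
    indices = range(len(pool))
    if strategy == "least_confidence":
        # Higher score = more uncertain, so sort in descending order.
        ranked = sorted(indices, key=lambda i: least_confidence(pool[i]), reverse=True)
    elif strategy == "margin":
        # Smaller margin = more uncertain, so sort in ascending order.
        ranked = sorted(indices, key=lambda i: margin(pool[i]))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return ranked[:k]


# Hypothetical model outputs for three unlabeled items (made-up numbers).
pool = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],  # low top probability
    [0.50, 0.49, 0.01],  # two outcomes nearly tied
]

print(select_for_labeling(pool, 1, "least_confidence"))
print(select_for_labeling(pool, 1, "margin"))
```

The two strategies can disagree: least confidence favors the item whose top probability is lowest, while margin sampling favors the item whose top two outcomes are closest.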
We study the problem of planning under model uncertainty in an online meta-reinforcement learning (RL) setting where an agent is presented with a sequence of related tasks with limited interactions per task. The agent can use its experience in each task and across tasks to estimate both the transition model and the distribution over tasks. We propose an algorithm to meta-learn the underlying structure across tasks, utilize it to plan in each task, and upper-bound the regret of the planning loss. Our bound suggests that the average regret over tasks decreases as the number of tasks increases and as the tasks are more similar. In the classical single-task setting, it is known that the planning horizon should depend on the estimated model's accuracy, that is, on the number of samples within the task. We generalize this finding to meta-RL and study the dependence of planning horizons on the number of tasks. Based on our theoretical findings, we derive heuristics for selecting slowly increasing discount factors, and we validate their significance empirically.
As language models have grown in parameters and layers, it has become much harder to train them and run inference with them on single GPUs. This severely restricts the availability of large language models such as GPT-3, BERT-Large, and many others. A common technique to solve this problem is pruning the network architecture by removing transformer heads, fully-connected weights, and other modules. The main challenge is to discern the important parameters from the less important ones. Our goal is to find strong metrics for identifying such parameters. We thus propose two strategies for calculating the importance scores: Cam-Cut, based on GradCAM interpretations, and Smooth-Cut, based on SmoothGrad. Through this work, we show that our scoring functions are able to assign more relevant task-based scores to the network parameters, and thus both our pruning approaches significantly outperform the standard weight and gradient-based strategies, especially at higher compression ratios in BERT-based models. We also analyze our pruning masks and find them to be significantly different from the ones obtained using standard metrics.
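Once importance scores are available, turning them into a pruning mask is a generic step that can be sketched as follows. This is only an illustration under stated assumptions: the `pruning_mask` helper and the example scores are hypothetical, and the sketch does not show how Cam-Cut or Smooth-Cut compute their scores.

```python
def pruning_mask(scores, keep_ratio):
    """Return a 0/1 mask that keeps the highest-scoring fraction of units.

    scores: one importance score per prunable unit (for example, per attention head).
    keep_ratio: fraction of units to keep, between 0 and 1.
    """
    n_keep = max(1, round(len(scores) * keep_ratio))
    # Indices of the n_keep units with the largest importance scores.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [1 if i in keep else 0 for i in range(len(scores))]


# Placeholder importance scores for four units (made-up numbers).
scores = [0.2, 0.9, 0.05, 0.6]
mask = pruning_mask(scores, keep_ratio=0.5)
print(mask)
```

Different scoring functions plug into the same masking step, so masks produced by two metrics can be compared directly, element by element.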
Neoplasms (NPs) and neurological diseases and disorders (NDDs) are among the major classes of diseases underlying the deaths of a disproportionate number of people worldwide. To determine whether distinctive features exist in the local wiring patterns of protein interactions emerging at the onset of a disease belonging to either of these two classes, we examined 112 and 175 protein interaction networks belonging to NPs and NDDs, respectively. Orbit usage profiles (OUPs) for each of these networks were enumerated by investigating the networks' local topology. 56 non-redundant OUPs (nrOUPs) were derived and used as network features for classification between these two disease classes. Four machine learning classifiers, namely k-nearest neighbour (KNN), support vector machine (SVM), deep neural network (DNN), and random forest (RF), were trained on these data. DNN obtained the greatest average AUPRC (0.988) among these classifiers. DNNs developed on node2vec and the proposed nrOUPs embeddings were compared using 5-fold cross validation on the basis of average values of six performance measures, viz., AUPRC, Accuracy, Sensitivity, Specificity, Precision and MCC. The nrOUPs-based classifier performed better on all six performance measures.
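Five of the six performance measures listed in this abstract follow directly from the counts in a binary confusion matrix, as the sketch below shows. AUPRC is omitted because it needs ranked prediction scores rather than counts. The function name and the example counts are hypothetical and are not results from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "mcc": mcc,
    }


# Made-up counts for illustration only.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC uses all four cells of the confusion matrix, which makes it informative when the two classes are imbalanced, as with the 112 versus 175 networks here.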
Customers are rapidly turning to social media for customer support. While brand agents on these platforms are motivated and well-intentioned to help and engage with customers, their efforts are often ignored if their initial response to the customer does not match a specific tone, style, or topic the customer is aiming to receive. The length of a conversation can reflect the effort and quality of the initial response made by a brand toward collaborating and helping consumers, even when the overall sentiment of the conversation might not be very positive. Thus, through this study, we aim to bridge this critical gap in the existing literature by analyzing language's content and stylistic aspects such as expressed empathy, psycho-linguistic features, dialogue tags, and metrics for quantifying personalization of the utterances that can influence the engagement of an interaction. This paper demonstrates that we can predict engagement using initial customer and brand posts.
Recently, e-scooter-involved crashes have increased significantly, but little information is available about the behaviors of on-road e-scooter riders. Most existing e-scooter crash research was based on retrospective, descriptive media reports, emergency room patient records, and crash reports. This paper presents a naturalistic driving study with a focus on e-scooter and vehicle encounters. The goal is to quantitatively measure the behaviors of e-scooter riders in different encounters to help facilitate crash scenario modeling, baseline behavior modeling, and the potential future development of in-vehicle mitigation algorithms. The data were collected using an instrumented vehicle and a wearable system for e-scooter riders. A three-step data analysis process is developed. First, semi-automatic data labeling extracts e-scooter rider images and non-rider human images in similar environments to train an e-scooter-rider classifier. Then, a multi-step scene reconstruction pipeline generates vehicle and e-scooter trajectories in all encounters. The final step is to model e-scooter rider behaviors and e-scooter-vehicle encounter scenarios. A total of 500 vehicle-to-e-scooter interactions are analyzed, and the variables describing them are also discussed in this paper.